Load packages

library(tidyverse)
library(patchwork)
library(dplyr)
library(tidyr)
library(ggplot2)

Introduction

In this case study, we seek to determine the best bio-markers of recent THC use within a 3-hour period by evaluating compounds across three matrices: blood, oral fluid, and breath. Additionally, we aim to determine the optimal THC concentration cutoffs for each matrix within this timeframe and to examine how treatment groups may influence THC levels. This analysis is important given the legal implications of THC detection in impaired driving cases. Nineteen states have implemented per se or zero-tolerance laws for THC, with per se limits varying between 1-5 ng/mL THC in blood 1. These limits suggest significant regulatory interest in identifying reliable biological markers of recent cannabis use, particularly in roadside testing contexts.

The study’s data set includes three matrices–blood, oral fluid, and breath–with a total of 573 participants across them. Blood and oral fluid matrices allow detection of multiple THC-related compounds (8 in blood, 7 in oral fluid), while breath analysis focuses solely on one compound. Blood and oral fluid matrices have previously demonstrated relatively stable THC levels over time, providing a basis for exploring how they reflect THC concentrations within a 3-hour window 2. Breath testing, however, remains a newer field of investigation, and its reliability for THC detection within a short timeframe is still under review. Identifying the optimal biological matrix for THC detection in recent use is important for law enforcement to establish reliable, science-based standards for roadside impairment testing.

The analysis also seeks to address the legal challenges posed by varying state regulations. For example, states like Colorado utilize a “reasonable inference” approach, where blood concentrations above 5 ng/ml THC suggest impairment 3. These differing legal standards show the importance of identifying clear, evidence-based cutoff values for THC detection across different matrices. In this case study, we aim to determine which matrix is most effective for detecting recent THC use within a 3-hour window and to identify the optimal cutoff values for each matrix. This analysis will help us understand how different matrices perform in detecting THC levels over time and provide insights into their potential effectiveness in practical settings.

Questions

  1. Which matrix–blood, oral fluid, or breath–is most effective for detecting recent THC use within a 3-hour window?
  2. What is the optimal cutoff within this 3-hour timeframe for detecting THC in each matrix?
  3. How does the treatment group (Placebo, 5.9% THC, or 13.4% THC) affect the detection of recent THC use in each matrix?

The Data

The data set that we will be using for our case study is provided to us by Professor Ellis in our data/ folder. The data provided will be in the form of csv files.

Data Import

In order to begin our analysis, we imported our raw data of whole blood, breath, and oral fluids from our data/ folder.

WB <- read_csv("data/Blood.csv")
BR <- read_csv("data/Breath.csv")
OF <- read_csv("data/OF.csv")

Data Wrangling

Now that we have our data properly imported, we need to wrangle our data in R before starting our analysis.

To begin our data wrangling, we imported the data as tables and modified the columns, transforming the data to improve its structure. There are three data set tables for each fluid type, blood, oral, and breath, in which we reformatted the entries in the Treatment column for better readability. We also converted all column names to snake case to simplify future coding. We renamed the columns for all compounds in each data set to ensure clear differentiation. Note that not every fluid type contains all of the compounds. To create time intervals, we turned the time_from_startcolumn into hour intervals in timepoint.

We start with the Whole Blood table.

WB <- WB |> 
  mutate(Treatment = fct_recode(Treatment, 
                                "5.9% THC (low dose)" = "5.90%",
                                "13.4% THC (high dose)" = "13.40%"),
         Treatment = fct_relevel(Treatment, "Placebo", "5.9% THC (low dose)")) |> 
  janitor::clean_names() |>
  rename(thcoh = x11_oh_thc,
         thccooh = thc_cooh,
         thccooh_gluc = thc_cooh_gluc,
         thcv = thc_v) |>
  mutate(timepoint = case_when(time_from_start < 0 ~ "pre-smoking",
                               time_from_start > 0 & time_from_start <= 60 ~ "0-60 min",
                               time_from_start > 61 & time_from_start <= 120 ~ "0-120 min",
                               time_from_start > 121 & time_from_start <= 180 ~ "0-180 min",
                               time_from_start > 180 ~ "181+ min"))

Then, the Oral Fluid table.

OF <- OF |> 
  mutate(Treatment = fct_recode(Treatment, 
                                "5.9% THC (low dose)" = "5.90%",
                                "13.4% THC (high dose)" = "13.40%"),
         Treatment = fct_relevel(Treatment, "Placebo", "5.9% THC (low dose)")) |> 
  janitor::clean_names() |>
  rename(thcoh = x11_oh_thc,
         thcv = thc_v,
         fluid_type = fluid) |>
  mutate(timepoint = case_when(time_from_start < 0 ~ "pre-smoking",
                               time_from_start > 0 & time_from_start <= 60 ~ "0-60 min",
                               time_from_start > 61 & time_from_start <= 120 ~ "0-120 min",
                               time_from_start > 121 & time_from_start <= 180 ~ "0-180 min",
                               time_from_start > 180 ~ "181+ min"))

Last, but not least, the Breath table.

BR <- BR |> 
  mutate(Treatment = fct_recode(Treatment, 
                                "5.9% THC (low dose)" = "5.90%",
                                "13.4% THC (high dose)" = "13.40%"),
         Treatment = fct_relevel(Treatment, "Placebo", "5.9% THC (low dose)")) |> 
  janitor::clean_names() |>
  rename(thc = thc_pg_pad,
         fluid_type = fluid) |>
  mutate(timepoint = case_when(time_from_start < 0 ~ "pre-smoking",
                               time_from_start > 0 & time_from_start <= 60 ~ "0-60 min",
                               time_from_start > 61 & time_from_start <= 120 ~ "0-120 min",
                               time_from_start > 121 & time_from_start <= 180 ~ "0-180 min",
                               time_from_start > 180 ~ "181+ min"))

After cleaning our data, we combined the three dataset tables into a single data set and stored in a new csv.

combined <- bind_rows(list(WB, OF, BR))
write_csv(combined, "data/combined_data.csv")

Exploratory Data Analysis (EDA)

To get a broad view of compound level in our dataset, we plotted compound levels over time within each fluid type to observe any changes and trends across different time intervals.

First we convert compound data to a long format for easier potting of concentration trends across time and compounds.

combined_long <- combined |> 
  select(1:5,time_from_start,everything()) |>
  pivot_longer(7:15)

Now that the data is in long format, we define a function to visualize compound concentration level over time for different fluid types. This function allows us to filter the data by a specific matrix (such as WB for whole blood, OF for oral fluid, or BR for breath) and pot scatter points of compound concentration (y-axis) against time from start (x-axis).

plot_scatter_time <- function(matrix) {
  combined_long |> 
      filter(!is.na(time_from_start), fluid_type==matrix) |>
      ggplot(aes(x=time_from_start, y=value, color=group)) + 
        geom_point() +
        facet_wrap(~name, scales="free") +
        scale_color_manual(values=c("#19831C", "#A27FC9")) +
        theme_bw() +
        labs(title=paste("Compound Levels Across Time for", matrix),
             x="Time From Start (min)",
             y="Concentration (pg/mL)") +
        theme(
              legend.position="bottom",
              legend.title=element_blank(),
              strip.background=element_blank(),
              plot.title.position="plot")
}

Our plot_scatter_time function to visualize compound concentration over time in whole blood (WB). This will help us identify trends in compound levels as time progresses within this specific fluid type.

plot_scatter_time(matrix = "WB")

Each plot represents a different compound, with frequent users in green and occasional users in purple. THC levels peak shortly after the start time, particularly for frequent users, and then decline quickly. Other compounds, like THCCOOH and THCCOOH_Gluc, stay more stable over time. Overall, frequent users tend to have higher concentrations across compounds compared to occasional users.

Next, we generate a similar plot for oral fluid (OF). This comparison will allow us to observe any unique patterns or differences in compound concentration across time interval to oral fluid samples.

plot_scatter_time(matrix = "OF")

THC levels show a sharp increase near starting time, particularly for less experienced users, before decreasing. Compounds like THCA_A and CBG also peak briefly before leveling out. Unlike whole blood, oral fluid concentrations are generally lower and decrease more rapidly. Less experienced users tend to have higher peak levels across several compounds, indicating possible differences in compound metabolism between groups.

Finally, we produce the plot for compound concentrations overall time within the breath (BR) fluid type.

plot_scatter_time(matrix = "BR")

THC showed a sharp but short peak near the start time, especially for experienced users, and levels quickly returned to baseline. Most of the other compounds were barely detectable throughout the time period, suggesting minimal presence in the breath samples. The lack of change in compounds over time suggests that breath may be less effective at tracking these compounds compared to other matrices.

boxplot_treatment <- function(matrix) {
  combined_long |> 
    # Filter for the specified matrix, compound "THC", and relevant timepoints
    filter(fluid_type == matrix,
           name == "thc",
           timepoint %in% c("pre-smoking", "0-60 min", "0-120 min", "0-180 min", "181+ min")) |>
     mutate(timepoint = factor(timepoint, levels = c("pre-smoking", "0-60 min", "0-120 min", "0-180 min", "181+ min")),
           treatment = factor(treatment, levels = unique(combined_long$treatment))) |>
    ggplot(aes(x = treatment, y = value, fill = group)) + 
    
      # Add jittered points for individual measurements
      geom_jitter(position = position_jitter(width = 0.3, height = 0), 
                  size = 0.8, 
                  color = "gray50") +
      geom_boxplot(outlier.size = 0.1, alpha = 0.8) +
      
      # Facet by timepoint for each specified interval
      facet_wrap(~ timepoint, scales = "free") +
  
      scale_x_discrete(labels = function(x) str_wrap(x, width = 3)) +
      scale_y_continuous(limits = c(0, NA), expand = expansion(mult = c(0, 0.1))) +
      scale_fill_manual(values = c("#19831C", "#A27FC9")) +
      theme_classic() +
      labs(title = paste("THC Levels by Treatment for", matrix), 
           y = "Measurement (ng/mL)") + 
      
      # Customize text size, legend position, and remove unnecessary panel elements
      theme(text = element_text(size = 10),
            legend.position = "bottom",
            legend.title = element_blank(),
            panel.grid = element_blank(),
            axis.line=element_line(),
            plot.title.position = "plot",
            strip.background = element_blank(),
            strip.text = element_text(size = 12))
}
boxplot_treatment("WB")

From each compound, we have

ggplot(combined, aes(x = time_from_start, y = thc)) +
  geom_point(size = 1, color = "lightgreen") +
  facet_wrap(~ fluid_type, scales = "free_y") +

  labs(
    title = "THC Levels across Time in each Fluid Type",
    x = "Time from Start (minutes)",
    y = "Concentration (pg/mL)",
  ) +
  theme_minimal() +
  theme(
    plot.title = element_text(hjust = 0.5),
    legend.position = "top"
  )

We are categorizing THC levels into ranges to simplify the analysis and help identify where THC concentrations fall over time across different fluids. This approach will give us a clearer view of any trends in THC distrbution, making it easier to compare patterns across fluid types and timepoints.

# Bin THC levels into specified ranges
combined <- combined  |>
  filter(!is.na(thc), !is.na(timepoint)) |>
  mutate(
    thc_bin = cut(thc, breaks = c(0, 1, 2, 3, 4, 5, 6, 7, 8),
                  labels = c("0-1", "1.1-2", "2.1-3", "3.1-4", "4.1-5", "5.1-6", "6.1-7", "7.1-8"),
                  include.lowest = TRUE)
  ) |>
  filter(!is.na(thc_bin))

# Count the number of THC levels in each bin, grouped by fluid_type and timepoint
thc_counts <- combined  |>
  group_by(fluid_type, timepoint, thc_bin) |>
  summarize(count = n(), .groups = 'drop')

ggplot(combined , aes(x = thc_bin)) +
  geom_bar(aes(fill = timepoint), position = "stack", color = "black") +
  facet_wrap(~ fluid_type) +
  labs(title = "THC Level Distribution by Matrix and Timepoint",
       x = "THC Level Range (ng/mL)",
       y = "Count") +
  scale_fill_viridis_d(option = "plasma") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

Analysis

ggplot(combined, aes(x = timepoint, y = thc, color = treatment, group = treatment)) +
  geom_line() +
  geom_point(position = position_jitter(width = 0.2), size = 2) +  # Add jitter for better visibility
  facet_wrap(~ fluid_type, scales = "free_y") +  # Create separate plots for each fluid type
  labs(
    title = "THC Levels Across Fluid Types Over Time",
    x = "Time Interval",
    y = "THC Level",
    color = "Dosage"
  ) +
  theme_minimal(base_size = 14) +  # Increase base font size for readability
  theme(
    plot.title = element_text(hjust = 0.5),
    axis.text.x = element_text(angle = 45, hjust = 1),  # Rotate x-axis labels for readability
    legend.position = "top"
  )

ggplot(combined %>% filter(fluid_type == "WB" & time_from_start >= 0 & time_from_start <= 180), aes(x = time_from_start, y = thc, color = treatment, group = treatment)) +
  geom_line() +
  geom_point(position = position_jitter(width = 0.2), size = 2) +  # Add jitter for better visibility
  labs(
    title = "THC Levels Across Whole Blood Over Time",
    x = "Time Interval",
    y = "THC Level",
    color = "Dosage"
  ) +
  theme_minimal(base_size = 14) +  # Increase base font size for better readability
  theme(
    plot.title = element_text(hjust = 0.5),
    axis.text.x = element_text(angle = 45, hjust = 1),  # Rotate x-axis labels
    legend.position = "bottom"
  )

ggplot(combined %>% filter(fluid_type == "WB" & time_from_start >= 0 & time_from_start <= 60), aes(x = time_from_start, y = thc, color = treatment, group = treatment)) +
  geom_line() +
  geom_point(position = position_jitter(width = 0.2), size = 2) +  # Add jitter for better visibility
  labs(
    title = "THC Levels Across Whole Blood Over Time",
    x = "Time Interval",
    y = "THC Level",
    color = "Dosage"
  ) +
  theme_minimal(base_size = 14) +  # Increase base font size for better readability
  theme(
    plot.title = element_text(hjust = 0.5),
    axis.text.x = element_text(angle = 45, hjust = 1),  # Rotate x-axis labels
    legend.position = "bottom"
  )

ggplot(combined %>% filter(fluid_type == "WB" & time_from_start >= 0 & time_from_start <= 60), aes(x = time_from_start, y = thc, color = treatment, group = treatment)) +
  geom_line() +
  geom_point(position = position_jitter(width = 0.2), size = 2) +
  facet_wrap(~treatment) +
  labs(
    title = "THC Levels Across Whole Blood Over Time",
    x = "Time Interval",
    y = "THC Level",
    color = "Dosage"
  ) +
  theme_minimal(base_size = 14) +  # Increase base font size for better readability
  theme(
    plot.title = element_text(hjust = 0.5),
    axis.text.x = element_text(angle = 45, hjust = 1),  # Rotate x-axis labels
    legend.position = "bottom"
  )

ggplot(combined %>% filter(fluid_type == "OF" & time_from_start >= 0 & time_from_start <= 60), aes(x = time_from_start, y = thc, color = treatment, group = treatment)) +
  geom_line() +
  geom_point(position = position_jitter(width = 0.2), size = 2) +
  facet_wrap(~treatment) +
  labs(
    title = "THC Levels Across Oral Fluid Over Time",
    x = "Time Interval",
    y = "THC Level",
    color = "Dosage"
  ) +
  theme_minimal(base_size = 14) +  # Increase base font size for better readability
  theme(
    plot.title = element_text(hjust = 0.5),
    axis.text.x = element_text(angle = 45, hjust = 1),  # Rotate x-axis labels
    legend.position = "bottom"
  )

Specificity and Sensitivity

As we focus on THC as the main compound to our study for recent THC use, it would be best to explore the [bold] specificity [bold] pf THC as we want to be the most certain that using knowledge that they are either a frequent or occasional user, are within the time interval of one hour, and tested through WB, we are more likely to identify with high specificity if the individual has used in the past hour (less false positives).

With respect to a study that found the highest accuracy of yielding 96% for a “molar metabolite ratio 1”, the study had measured whole blood THC and metabolites in the time interval of 30 minutes “after starting a 15-minute interval” of cannabis intake, where they had found with THC alone the cut-point yielded “88% specificity, 73% sensitivity, and 80% accuracy” for identifying recent cannabis use.

Within our study, we want to utilize the focus for specificity as we calculate our data to include the cutoff of similarly to their cut-point of 5.3 mu grams/L, which is 5.3 ng/mL. In the study, it has yielded around 75% specificity and sensitivity, so this will be our baseline with the data we have currently. As the data displays THC concentration of

Change Combined Data to include only “Frequent user” and “Occassional user” instead of “experienced” and “non-experienced”

combined_user_long <- combined_long |>
  mutate(group = fct_recode(group, "Frequent user" = "Experienced user", "Occasional user" = "Not experienced user"))

Range, Median, Mean of THC in Whole Blood

wbthc_range <- range(WB$thc)
wbthc_median <- median(WB$thc, na.rm = TRUE)
wbthc_mean <- mean(WB$thc, na.rm = TRUE)

WBTHC_summary <- list(
  range = wbthc_range,
  median = wbthc_median,
  wbthc_mean
)

print(WBTHC_summary)
## $range
## [1]   0.0 230.6
## 
## $median
## [1] 1.3
## 
## [[3]]
## [1] 5.980217

Calculation Function for Specificity and Sensitivity

make_calculations <- function(dataset, cutoff, compound, timepoint_use){
  ## remove NAs
  df <- dataset |>
    select(treatment, {{ compound }}, timepoint) |>
    filter(timepoint == timepoint_use, !is.na({{ compound }}))

  if(nrow(df)>0){
    if(timepoint_use == "pre-smoking"){
      output <- df |> 
        summarize(TP = 0,
                  FN = 0,
                  FP = sum(!!sym(compound) >= cutoff),
                  TN = sum(!!sym(compound) < cutoff)) 
    }else{
      if(cutoff == 0){
        output_pre <- df |> 
          filter(timepoint_use == "pre-smoking") |>
          summarize(TP = 0,
                    FN = 0,
                    FP = sum(!!sym(compound) >= cutoff),
                    TN = sum(!!sym(compound) < cutoff)) 
        
        output <- df |> 
          filter(timepoint_use != "pre-smoking") |>
          summarize(TP = sum(treatment != "Placebo" & !!sym(compound) > cutoff),
                    FN = sum(treatment != "Placebo" & !!sym(compound) <= cutoff),
                    FP = sum(treatment == "Placebo" & !!sym(compound) > cutoff),
                    TN = sum(treatment == "Placebo" & !!sym(compound) < cutoff))
        
        output <- output + output_pre
      }else{
        output_pre <- df |> 
          filter(timepoint_use == "pre-smoking") |>
          summarise(TP = 0,
                    FN = 0,
                    FP = sum(!!sym(compound) >= cutoff),
                    TN = sum(!!sym(compound) < cutoff)) 
        
        output <- df |> 
          filter(timepoint_use != "pre-smoking") |>
          summarise(TP = sum(treatment != "Placebo" & !!sym(compound) >= cutoff),
                    FN = sum(treatment != "Placebo" & !!sym(compound) < cutoff),
                    FP = sum(treatment == "Placebo" & !!sym(compound) >= cutoff),
                    TN = sum(treatment == "Placebo" & !!sym(compound) < cutoff))
        
        output <- output + output_pre
      }
    }
  
  # clean things up; make calculations on above values
  output <- output |>
    mutate(detection_limit = cutoff,
           compound = compound,
           time_window = timepoint_use,
           NAs = nrow(dataset) - nrow(df),
           N = nrow(dataset),
           Sensitivity = (TP/(TP + FN)), 
           Specificity = (TN /(TN + FP)),
           PPV = (TP/(TP+FP)),
           NPV = (TN/(TN + FN)),
           Efficiency = ((TP + TN)/(TP + TN + FP + FN))*100
    )
  return(output)
}
}

Map for each matrix

sens_spec_cpd <- function(dataset, cpd, timepoints){
  args2 <- list(start = timepoints$start, 
                stop = timepoints$stop, 
                tpt_use = timepoints$timepoint)
  out <- args2 |> 
    pmap_dfr(make_calculations, dataset, compound = cpd)
  return(out)
}

Choosing the cutoffs for THC, we used 5 ng/mL as our baseline representation.

# specify which calculations to make
cutoffs <- c(0.5, 1, 5, 7.5, 10)
compounds <- combined_user_long |> filter(fluid_type=="WB") |> filter(!is.na(value)) |> distinct(name) |> pull(name)
WB_timepoints <- c("pre-smoking", "0-60 min", "0-120 min", "181+ min")
WB <- combined |> filter(fluid_type=="WB")

# Specify all parameter combinations
param_grid <- expand.grid(
  cutoffs = cutoffs,
  compounds = compounds,
  timepoint_use = WB_timepoints)


# Calculate for all cutoff-compound-timepoint combinations
WB_ss <- purrr::pmap_dfr(param_grid, ~ make_calculations(dataset=WB, cutoff = ..1, compound = as.character(..2), timepoint_use = ..3))

WB_ss
## # A tibble: 160 × 14
##       TP    FN    FP    TN detection_limit compound time_window   NAs     N
##    <dbl> <dbl> <int> <int>           <dbl> <chr>    <fct>       <int> <int>
##  1     0     0     1   185             0.5 cbn      pre-smoking  1119  1305
##  2     0     0     0   186             1   cbn      pre-smoking  1119  1305
##  3     0     0     0   186             5   cbn      pre-smoking  1119  1305
##  4     0     0     0   186             7.5 cbn      pre-smoking  1119  1305
##  5     0     0     0   186            10   cbn      pre-smoking  1119  1305
##  6     0     0     4   182             0.5 cbd      pre-smoking  1119  1305
##  7     0     0     1   185             1   cbd      pre-smoking  1119  1305
##  8     0     0     0   186             5   cbd      pre-smoking  1119  1305
##  9     0     0     0   186             7.5 cbd      pre-smoking  1119  1305
## 10     0     0     0   186            10   cbd      pre-smoking  1119  1305
## # ℹ 150 more rows
## # ℹ 5 more variables: Sensitivity <dbl>, Specificity <dbl>, PPV <dbl>,
## #   NPV <dbl>, Efficiency <dbl>

Sensitivity and Specificity Plot

plot_cutoffs <- function(dataset, timepoint_use_variable, tissue, cpd){
    # control colors and lines used in plots
    col_val = c("#D9D9D9", "#BDBDBD", "#969696", "#636363", "#252525")
    lines = rep("solid", 5)
    
    # prep data
    df_ss <- dataset |> 
      filter(compound == cpd) |>
      mutate(time_window = fct_relevel(as.factor(time_window), levels(timepoint_use_variable)),
             detection_limit = as.factor(detection_limit),
             Sensitivity =  round(Sensitivity*100, 0),
             Specificity =  round(Specificity*100, 0))        
      
    # plot sensitivity
    p1 <- df_ss |> 
      ggplot(aes(x = time_window, y = Sensitivity, 
                 color = detection_limit)) + 
      geom_line(linewidth = 1.2, aes(group = detection_limit, 
                                linetype = detection_limit)) + 
      geom_point(show.legend=FALSE) + 
      ylim(0,100) +
      scale_x_discrete(labels = function(x) str_wrap(x, width = 5)) +
      scale_linetype_manual(values=lines) +
      scale_color_manual(values = col_val, name = "Cutoff \n (ng/mL)",
                        guide = guide_legend(override.aes = list(linetype = c(1),
                        shape = rep(NA, length(lines))) )) +
      theme_classic() +
      theme(plot.title.position = "plot",
            axis.title = element_text(size=14),
            axis.text = element_text(size=10),
            legend.position = "none", 
            panel.grid = element_blank(),
            strip.background = element_blank()
            ) +
      guides(linetype = "none") +
      labs(x = "Time Window (min)", 
           y = "Sensitivity", 
           title = paste0(tissue,": ", toupper(cpd)) )
  
  # plot specificity
  p2 <- df_ss |> 
      ggplot(aes(x = time_window, y = Specificity,
                 group = detection_limit, 
                 color = detection_limit, 
                 linetype = detection_limit)) + 
      geom_line(linewidth = 1.2) +
      geom_point() + 
      ylim(0,100) +
      scale_color_manual(values = col_val) +
      scale_x_discrete(labels = function(x) str_wrap(x, width = 5)) +
      scale_linetype_manual(values = lines, 
                            guide = guide_legend(override.aes = list(linetype = "solid",
                                                                     shape = rep(NA, length(lines))) )) +
      theme_classic() +
      theme(axis.title = element_text(size=14),
            axis.text = element_text(size=10),
            legend.position = c(0.35, 0.25),
            panel.grid = element_blank(),
            strip.background = element_blank()) +
      labs(x = "Time Window", 
           y = "Specificity",
           title = "" )
  
  # combine plots (uses patchwork)
  p1 + p2
}

Whole Blood Content for THC Cutoff Plot Over Time

plot_cutoffs(dataset=WB_ss, 
             timepoint_use_variable=WB$timepoint, 
             tissue="Blood", 
             cpd="thc")

We can see that with our plot of specificity, 5 ng/mL and more of THC in blood yield over 90% in specificity and over 73% sensitivity to identify individuals who do not have any THC in the system, indicating that similar to prior research, there is validity to the concentration of 5.3 ng/mL for THC to be likely to be identifiable of THC and can provide decisions on when to exclude false positives of THC in system. With sensitivity, although high in the first hour, the time periods drop directly after to lower than 25% sensitvity after 3 hours, so we want to note for future concentrations that have an increase dose of 5 ng/mL may opt for higher sensitivity.

Now, to compare with other matrix’s to see if THC has a higher sensitivity and specificity compared to Oral Fluid and Breath.

Oral Fluid

# specify which calculations to make
cutoffs <- c(0.5, 1, 5, 7.5, 10)
compounds <- combined_user_long |> filter(fluid_type=="OF") |> filter(!is.na(value)) |> distinct(name) |> pull(name)
OF_timepoints <- c("pre-smoking","0-60 min","0-120 min",
                   "181+ min") 
OF <- combined |> filter(fluid_type=="OF")

# Specify all parameter combinations
param_grid <- expand.grid(
  cutoffs = cutoffs,
  compounds = compounds,
  timepoint_use = OF_timepoints)

# Calculate for all cutoff-compound-timepoint combinations
OF_ss <- purrr::pmap_dfr(param_grid, ~ make_calculations(dataset=OF, cutoff = ..1, compound = as.character(..2), timepoint_use = ..3))

OF_ss
## # A tibble: 140 × 14
##       TP    FN    FP    TN detection_limit compound time_window   NAs     N
##    <dbl> <dbl> <int> <int>           <dbl> <chr>    <fct>       <int> <int>
##  1     0     0     5   187             0.5 cbn      pre-smoking   430   622
##  2     0     0     1   191             1   cbn      pre-smoking   430   622
##  3     0     0     1   191             5   cbn      pre-smoking   430   622
##  4     0     0     0   192             7.5 cbn      pre-smoking   430   622
##  5     0     0     0   192            10   cbn      pre-smoking   430   622
##  6     0     0     4   188             0.5 cbd      pre-smoking   430   622
##  7     0     0     1   191             1   cbd      pre-smoking   430   622
##  8     0     0     0   192             5   cbd      pre-smoking   430   622
##  9     0     0     0   192             7.5 cbd      pre-smoking   430   622
## 10     0     0     0   192            10   cbd      pre-smoking   430   622
## # ℹ 130 more rows
## # ℹ 5 more variables: Sensitivity <dbl>, Specificity <dbl>, PPV <dbl>,
## #   NPV <dbl>, Efficiency <dbl>
plot_cutoffs(dataset=OF_ss, 
             timepoint_use_variable=OF$timepoint, 
             tissue="Oral Fluid", 
             cpd="thc")

When looking at Oral Fluid, we can see that for 5 ng/mL, the trend of high sensitivity and specificity continues as for the presmoking period in specificity and for the first hour for both all yield high percentages, indicating that oral fluid also contributes to having a high likelihood to detect THC even in small amounts. However, although sensitivity is higher for all concentrations in the first hour, there is a large falloff in the first hour, distinctly in detection of false positives in specificity for lower doses within the hour interval, which does not yield more accurate results compared to whole blood.

Breath Level

# specify which calculations to make
cutoffs <- c(0.5, 1, 5, 7.5, 10)
compounds <- combined_user_long |> filter(fluid_type=="BR") |> filter(!is.na(value)) |> distinct(name) |> pull(name)
BR_timepoints <- c("pre-smoking","0-60 min","0-120 min",
                   "181+ min") 
BR <- combined |> filter(fluid_type=="BR")

# Specify all parameter combinations
param_grid <- expand.grid(
  cutoffs = cutoffs,
  compounds = compounds,
  timepoint_use = OF_timepoints)

# Calculate for all cutoff-compound-timepoint combinations
BR_ss <- purrr::pmap_dfr(param_grid, ~ make_calculations(dataset=BR, cutoff = ..1, compound = as.character(..2), timepoint_use = ..3))

BR_ss
## # A tibble: 20 × 14
##       TP    FN    FP    TN detection_limit compound time_window   NAs     N
##    <dbl> <dbl> <int> <int>           <dbl> <chr>    <fct>       <int> <int>
##  1     0     0     0   185             0.5 thc      pre-smoking   537   722
##  2     0     0     0   185             1   thc      pre-smoking   537   722
##  3     0     0     0   185             5   thc      pre-smoking   537   722
##  4     0     0     0   185             7.5 thc      pre-smoking   537   722
##  5     0     0     0   185            10   thc      pre-smoking   537   722
##  6     0     2     0    48             0.5 thc      0-60 min      672   722
##  7     0     2     0    48             1   thc      0-60 min      672   722
##  8     0     2     0    48             5   thc      0-60 min      672   722
##  9     0     2     0    48             7.5 thc      0-60 min      672   722
## 10     0     2     0    48            10   thc      0-60 min      672   722
## 11     0    85     0    61             0.5 thc      0-120 min     576   722
## 12     0    85     0    61             1   thc      0-120 min     576   722
## 13     0    85     0    61             5   thc      0-120 min     576   722
## 14     0    85     0    61             7.5 thc      0-120 min     576   722
## 15     0    85     0    61            10   thc      0-120 min     576   722
## 16     0   210     0   114             0.5 thc      181+ min      398   722
## 17     0   210     0   114             1   thc      181+ min      398   722
## 18     0   210     0   114             5   thc      181+ min      398   722
## 19     0   210     0   114             7.5 thc      181+ min      398   722
## 20     0   210     0   114            10   thc      181+ min      398   722
## # ℹ 5 more variables: Sensitivity <dbl>, Specificity <dbl>, PPV <dbl>,
## #   NPV <dbl>, Efficiency <dbl>
plot_cutoffs(dataset=BR_ss, 
             timepoint_use_variable=BR$timepoint, 
             tissue="Breath", 
             cpd="thc")

Compared to the Whole Blood and Oral Fluid, Breath seems to be only applicable for higher detection limits, which in this case is 10 ng/mL. And since we have already found that 5 ng/mL is already accurate in detecting with high specificity and sensitivity and we would like the lower dosage possible for detection, we can assess that the breath matrix is unnecessary for our study and has use for larger doses in the first three hours.

# specify which calculations to make
cutoffs <- c(0.1, 5, 5.5, 6, 6.5)
compounds <- combined_user_long |> filter(fluid_type=="WB") |> filter(!is.na(value)) |> distinct(name) |> pull(name)
WB_timepoints <- c("pre-smoking", "0-60 min", "0-120 min", "181+ min")
WB <- combined |> filter(fluid_type=="WB")

# Specify all parameter combinations
param_grid <- expand.grid(
  cutoffs = cutoffs,
  compounds = compounds,
  timepoint_use = WB_timepoints)


# Calculate for all cutoff-compound-timepoint combinations
WB_new_ss <- purrr::pmap_dfr(param_grid, ~ make_calculations(dataset=WB, cutoff = ..1, compound = as.character(..2), timepoint_use = ..3))

WB_new_ss
## # A tibble: 160 × 14
##       TP    FN    FP    TN detection_limit compound time_window   NAs     N
##    <dbl> <dbl> <int> <int>           <dbl> <chr>    <fct>       <int> <int>
##  1     0     0     1   185             0.1 cbn      pre-smoking  1119  1305
##  2     0     0     0   186             5   cbn      pre-smoking  1119  1305
##  3     0     0     0   186             5.5 cbn      pre-smoking  1119  1305
##  4     0     0     0   186             6   cbn      pre-smoking  1119  1305
##  5     0     0     0   186             6.5 cbn      pre-smoking  1119  1305
##  6     0     0     4   182             0.1 cbd      pre-smoking  1119  1305
##  7     0     0     0   186             5   cbd      pre-smoking  1119  1305
##  8     0     0     0   186             5.5 cbd      pre-smoking  1119  1305
##  9     0     0     0   186             6   cbd      pre-smoking  1119  1305
## 10     0     0     0   186             6.5 cbd      pre-smoking  1119  1305
## # ℹ 150 more rows
## # ℹ 5 more variables: Sensitivity <dbl>, Specificity <dbl>, PPV <dbl>,
## #   NPV <dbl>, Efficiency <dbl>
plot_cutoffs(dataset=WB_new_ss, 
             timepoint_use_variable=WB$timepoint, 
             tissue="Blood", 
             cpd="thc")

In context for cutoffs between 5 ng/mL -7 ng/mL, it has shown that higher detection limits do often become less sensitive to accuracy overtime, leading to a drop in likelihood to identify if THC has been recently used. The cutoffs in the middle have the most likelihood to be optimal in detecting THC in Blood that would be less sensitive to small traces of THC in the system that may not be causal for impairment, but is more likely to pick up a concentration of THC metabolites that is sustained in Blood for longer overtime.

Results & Discussion

The context of our question focuses on 1. Which matrix–blood, oral fluid, or breath–is most effective for detecting recent THC use within a 3-hour window? 2. What is the optimal cutoff within this 3-hour timeframe for detecting THC in each matrix? 3. How does the treatment group (Placebo, 5.9% THC, or 13.4% THC) affect the detection of recent THC use in each matrix?

Conclusion

[^4] Kosnett, M. J., Ma, M., Dooley, G., Wang, G. S., Friedman, K., Brown, T., Henthorn, T. K., & Brooks-Russell, A. (2023). Blood cannabinoid molar metabolite ratios are superior to blood THC as an indicator of recent cannabis smoking. Clinical Toxicology, 61(5), 355–362. https://doi.org/10.1080/15563650.2023.2214697


  1. Pearson, J., et al. (2020). Legal implications of THC per se limits. Journal of Accident Analysis & Prevention. Retrieved from Taylor & Francis Online.↩︎

  2. Huestis, M. A. (2007). Human cannabinoid pharmacokinetics. Chemistry & Biodiversity, 4(8), 1770-1804. [https://pmc.ncbi.nlm.nih.gov/articles/PMC2689518/).↩︎

  3. Compton, R. (2017). Marijuana-impaired driving: A report to Congress. National Highway Traffic Safety Administration. Retrieved from NGTSA.↩︎